One of the oldest known ciphers is known as Caesar's cipher. Julius Caesar encrypted his messages by shifting every letter of the alphabet three spaces forward and looping back when the end of the alphabet is reached. Consequently, A
would be mapped to D
and Z
would be mapped to C
.
An immediate problem with this cipher is the lack of a key - the shift amount is always the same. A natural extension of the cipher would then be to let the shift amount vary, turning it into a key whose possible values are the numbers between 0 and 25. Therefore, the key space is
An encryption algorithm
The notation
It is only natural to now ask, is this cipher secure? And the simple answer is no. There are only 26 possible keys and so the key-space is not sufficiently big. You can even go through all 26 possible keys with a given ciphertext by hand and check which resulting plaintext makes sense. Most likely, there will be only one and so you would have recovered the original message.
Another method to crack this cipher is by using frequency analysis. Since the shift cipher is a one-to-one mapping on a letter-by-letter basis, the frequency distribution of letters is preserved. For example, the most common letter in English is the letter "e". If we analyse the ciphertext and discover that the most common letter there is "g", then we know that most likely the letter "g" is the letter "e" encrypted with the given key. From this we can calculate the key to be 2 (however, the plaintext, and therefore the ciphertext, may actually deviate from this distribution, so it is not with 100% certainty that the key is 2). We can also perform the same procedure with the rest of the letters in the ciphertext and retrieve the original plaintext. This process can also be automated with some math.
Let's once again map the alphabet with the integers 0 through 25 and also this time let
Now, let
for every value of
This cipher is a more advanced version of the shift cipher. It is a poly-alphabetic shift cipher. Unlike the previous ciphers, it does not define a fixed mapping on a letter-by-letter basis. Instead, it maps blocks of letters whose size depends on the key length. For example, ab
could be mapped to xy
, ac
to zt
, and aa
to bc
. Moreover, identical blocks will be mapped to different blocks depending on their relative position in the plaintext. ab
could once be mapped to xy
, but then when ab
appears again, it may be mapped to ci
.
In the Vigenère cipher the key is no longer a single number, but rather a string of letters, where each letter is again mapped to the integers
Plaintext: the golden sun shone brightly, bathing the beach in its warm sunlight
Key: cok ecokec oke cokec okecokec, okecoke cok ecoke co kec okec okecokec
Ciphertext: vvo kqznip ger uvyrg pbmivdpa, pkxjwxk vvo fgoml kb sxu kkvo gernwqlv
Given a known key length, also called a period, theg
and olde
, t
and o
would have both been encrypted with c
, h
and l
with o
and so on. Such characters are said to comprise a stream. Stated in a more mathematical way, for all
If the period vvo
s is 32 which is 8 times the period 4.
There is also a more automatable (if this is even a word) approach. Recall that, given a period
If we let
Referring back to previous analysis, we get that
When